Article 3124

Title of the article

Variants of Zipf’s hyperbolic law for fractal description of probability-rank distribution of letters in English technical texts 

Authors

Aleksandr I. Ivanov, Doctor of engineering sciences, professor, scientific adviser, Penza Scientific Research Electrotechnical Institute (9 Sovetskaya street, Penza, Russia), E-mail: ivan@pniei.penza.ru
Aleksey P. Ivanov, Candidate of engineering sciences, associate professor, head of the sub-department of technical means of information security, Penza State University (40 Krasnaya street, Penza, Russia), E-mail: ap_ivanov@pnzgu.ru
Aleksey P. Yunin, Lead expert, Penza Scientific Research Electrotechnical Institute (9 Sovetskaya street, Penza, Russia), E-mail: unin_ap@pniei.penza.ru
Roman V. Eremenko, Senior lecturer of the sub-department of radio and satellite communications, Military Training Center, Penza State University (40 Krasnaya street, Penza, Russia), E-mail: tsib@pnzgu.ru

Abstract

Background. The study extends the scope of the classical hyperbolic Zipf’s law. Previously, this law was used to describe statistically the probability of occurrence of words in texts in a European language. Materials and methods. Letters are sorted by the probability of their use in English texts 3799 characters long. Results. It is shown that the statistics of frequently used lowercase letters are easily separable from those of uppercase letters. Even for short texts, ordering letter codes by their probability of occurrence yields another hyperbolic Zipf distribution for letters and punctuation marks. Conclusions. The distribution of the lengths of words delimited by spaces on both sides, following Mandelbrot’s method, is given. In addition, the selection of “words” delimited by the ASCII code 101, which corresponds to the English letter “e”, is given. It is shown that corresponding functionals with linear computational complexity can be constructed for the other frequently used English letters “t”, “a”, “i”, “r”. The result is a set of new statistical functionals for deeper analysis of texts in European languages. These functionals can be used to evaluate the strength of long passphrases.
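The two kinds of statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `rank_probabilities` and `segment_lengths` are hypothetical, and it is an assumption of this sketch that runs at the very start and end of the text are counted even though they are bounded by a delimiter on one side only. The first function orders characters by their empirical probability (the probability-rank distribution); the second measures, in a single linear pass, the lengths of runs between occurrences of a chosen delimiter such as a space or the letter with ASCII code 101.

```python
from collections import Counter


def rank_probabilities(text):
    """Return (character, probability) pairs ordered by descending probability."""
    counts = Counter(text)
    total = len(text)
    # Sort by descending count; ties are broken by character for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(ch, n / total) for ch, n in ordered]


def segment_lengths(text, delimiter):
    """Lengths of non-empty runs between delimiters, found in one pass.

    Assumption of this sketch: leading and trailing runs are included.
    """
    lengths = []
    run = 0
    for ch in text:
        if ch == delimiter:
            if run:
                lengths.append(run)
            run = 0
        else:
            run += 1
    if run:
        lengths.append(run)
    return lengths


if __name__ == "__main__":
    sample = "the letter e separates the text"
    # Probability-rank table: rank 1 is the most probable character.
    for rank, (ch, p) in enumerate(rank_probabilities(sample), start=1):
        print(rank, repr(ch), round(p, 3))
    # Word lengths between spaces, then run lengths between letters "e".
    print(segment_lengths(sample, " "))
    print(segment_lengths(sample, chr(101)))
```

Passing `"t"`, `"a"`, `"i"` or `"r"` as the delimiter gives the analogous run-length statistics for the other frequent letters; each call reads the text once, so its cost grows linearly with the text length.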

Key words

distribution law, Zipf’s hyperbolic word distribution law, hyperbolic letter distribution law 

For citation:

Ivanov A.I., Ivanov A.P., Yunin A.P., Eremenko R.V. Variants of Zipf’s hyperbolic law for fractal description of probability-rank distribution of letters in English technical texts. Izvestiya vysshikh uchebnykh zavedeniy. Povolzhskiy region. Tekhnicheskie nauki = University proceedings. Volga region. Engineering sciences. 2024;(1):39–47. (In Russ.). doi: 10.21685/2072-3059-2024-1-3

 

Date created: 19.06.2024 11:02
Date updated: 27.06.2024 14:23